PROTOCOL: Effects of guaranteed basic income interventions on poverty‐related outcomes in high‐income countries: A systematic review

Abstract This is the protocol for a Campbell systematic review. The objectives are as follows: to appraise and synthesize the available quantitative evidence on GBI interventions in high‐income countries, for the purpose of comparing the relative effectiveness of specific forms of GBI for alleviating poverty.


| Poverty in high-income countries
Although the concept of poverty in high-income countries seems like a contradiction in terms, there are nonetheless many people in these countries who rely on social assistance benefits, subsidized housing, donated clothing, and food banks to make ends meet. The incongruity of experiencing poverty in countries that are considered to be wealthy can be explained in part by the definition of a high-income country: one that has a gross national income (GNI) per capita of US$12,696 or more (World Bank, 2022). Since the per capita amount is calculated by dividing the gross national income by the country's population, it does not provide any information on the distribution of the income within the population or indicate how many of its citizens are unable to afford a basic standard of living.
While it is expected that some people in the free-market economies of high-income countries will earn more money than others, income inequality has increased in most developed countries since 1990 (United Nations, 2020). Also, the proportion of the population in the middle-income class (having 75%-200% of the national median household income) has declined since the mid-1980s in most developed countries, while the size of the lower-income class (below 75% of the national median household income) has grown in most (OECD, 2019). In contrast, due to strong economic growth in developing countries in the last two decades, the size of the global middle class has nearly doubled or tripled in that time, depending on the measure used (Versace, 2021). One factor in these divergent trends between higher-income and lower-income countries is the outsourcing of manufacturing by developed countries in recent decades, combined with technological advancement that has displaced routine-based jobs, while increasing computing power and artificial intelligence are also placing non-routine jobs at risk (OECD, 2021a).
Campbell Systematic Reviews. 2022;18:e1281. wileyonlinelibrary.com/journal/cl2
According to the International Labour Organization (ILO), 22% of people in developed countries (more than 300 million) were considered poor in 2012, with an income of less than 60% of the national median. Since then, various indicators have shown poverty rates to be either unchanged or, in the case of the European Union (EU), to have trended higher after the 2008 global financial crisis (ILO, 2016). Based on the poverty threshold of the Organisation for Economic Co-operation and Development (OECD), which is 50% of the national household median income, the poverty rates in developed countries remained fairly stable between 2008 and 2019, ranging from 5.6% in Czechia (Czech Republic) to 18% in the United States (OECD, 2022). These data also show the poverty rate for children (0-17 years old) in the United States and Spain to be the highest among developed countries, at 21%. (It should be noted that all the figures above refer to relative poverty, based on median incomes in these countries, and not to absolute poverty, which is associated with problems such as malnutrition, unsafe drinking water, and lack of basic education; Peer, 2021).
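As a rough illustration of how a relative poverty line of this kind works, the sketch below computes the share of households whose income falls below a chosen fraction of the median (0.5 for the OECD threshold, 0.6 for the measure cited by the ILO). The function name and the household incomes are hypothetical and are not drawn from any of the sources cited above.

```python
from statistics import median


def relative_poverty_rate(incomes, fraction_of_median=0.5):
    """Share of households with income below a fraction of the median income."""
    threshold = fraction_of_median * median(incomes)
    poor = sum(1 for income in incomes if income < threshold)
    return poor / len(incomes)


# Hypothetical household incomes, for illustration only (not real data).
incomes = [8_000, 14_000, 22_000, 30_000, 36_000,
           41_000, 55_000, 70_000, 95_000, 160_000]

rate_50 = relative_poverty_rate(incomes, 0.5)  # 50%-of-median line
rate_60 = relative_poverty_rate(incomes, 0.6)  # 60%-of-median line
print(f"50% of median: {rate_50:.0%}; 60% of median: {rate_60:.0%}")
```

Because the threshold moves with the median, the same set of incomes yields a higher poverty rate under the 60% line than under the 50% line, which is one reason figures from different agencies are not directly comparable.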
Considering the basic material needs of food and shelter can also shed light on the prevalence of poverty, and these needs are unmet, either temporarily or chronically, for many people in high-income countries. Because homelessness involves complex underlying factors besides not being able to afford housing, such as addictions, abusive relationships, and mental illness, this experience of poverty is outside the scope of this review, but has been addressed in others (e.g., Aubry, 2020; Nilsson, 2019). Inadequate access to food, on the other hand, is directly related to financial means in high-income countries, as reflected in commonly used definitions of food insecurity: "a lack of available financial resources for food at the household level" (Hunger & Health, 2022), "[not] having physical and economic access to sufficient healthy food at all times" (Department for Environment, Food & Rural Affairs, 2021), and "the inadequate or insecure access to food because of financial constraints" (Tarasuk & Mitchell, 2020).
Over the past five decades, there has been a proliferation of food banks ("food pantries" in the United States) in all high-income countries; however, because of their dependence on charitable donations, food banks are limited in their capacity to alleviate food insecurity (Loopstra, 2018). The prevalence of food banks in high-income countries is an important factor in relation to poverty because the people who rely on food banks for assistance are typically in the most food-insecure categories (moderately or severely food-insecure) and have lower incomes than food-insecure people who do not rely on food banks.

| Policies and programs for reducing poverty
Social justice advocates have long asserted that poverty reduction is a moral obligation of the state which can be achieved by a fairer distribution of wealth (Barder, 2009; Standing, 2019). Although various types of support have been provided by the state to people in poverty since ancient times, the modern concept of social welfare emerged in the late 19th century in Germany under Chancellor von Bismarck, based on the precept that people facing poverty and distress should receive assistance from the state, not as a matter of charity but as a right (Rose, 1985). Other high-income countries followed suit during the 20th century, implementing social assistance programs to alleviate poverty after the Great Depression (Trattner, 2007). In the United Kingdom during the Second World War, economist Sir William Beveridge wrote a report for the government which called for a "revolution" in the direction of Britain's welfare state and laid out a comprehensive set of social assistance programs, ranging from child benefits to pensions and funeral allowances. The Beveridge Report expanded on programs introduced by Lloyd George and Churchill three decades earlier and provided the blueprint for modern welfare in the United Kingdom (Day, 2017; Wheeler, 2015). Similarly, the Marsh Report of 1943 provided the foundation for the current social security system in Canada, by proposing measures similar to Beveridge's (a mentor of Marsh) and adding elements such as an employment program and health care insurance (Policy Options, 2004).
The cost of social assistance programs in high-income countries is equivalent to between 12% and 31% of the gross domestic product (GDP), depending on the country (OECD, 2020). The generosity of social assistance also varies over time, with cutbacks being common during economic recessions due to politicians being pressured to support workers not "shirkers" (Romano, 2015).
Social welfare programs were found to reduce poverty significantly in high-income countries between 1960 and 1991 (Kenworthy, 1999). Since then, however, welfare reforms (often called "workfare" because of their emphasis on transitioning social assistance recipients into the workforce) have been blamed by critics for reversing the poverty reduction trend by cutting benefits to the unemployed, including single mothers, and requiring them to accept precarious, low-paying jobs (Carey & Bell, 2020; Widerquist et al., 2013). The increased conditionality of workfare may also result in additional stigma and shame for recipients who either remain unemployed or are skilled but placed in menial, low-paying jobs (Carey & Bell, 2020; Widerquist et al., 2013). Sanctions in the form of benefit cuts and interruptions are intended to increase compliance with the conditions of workfare programs (e.g., actively seeking work), but some studies have suggested that these sanctions can have detrimental effects on mental and physical health, debt, material hardship, and financial stress (Pattaro, 2022).
Because social assistance programs rely on a minimum income threshold to determine eligibility, transitioning to a low-paying job with an income slightly above the threshold results in losing the benefit. Additionally, it may also mean losing in-kind benefits such as a rent subsidy and dental care, so a person's net income may end up being even lower than the amount provided by social assistance (Wolfson, 2018).
A distinguishing feature of social assistance in many high-income countries is the availability of various programs, offered by different levels of government and targeted at specific groups (e.g., people with disabilities, women with infant children) and specific needs (e.g., money for food or rent). This approach has been criticized as being a patchwork of programs that are confusing in terms of understanding eligibility criteria, and which fail to provide some categories of people with a subsistence-level income (Koebel & Pohler, 2019; Wolfson, 2018). The complexity of the programs and uncertainty regarding eligibility also translate into high levels of non-take-up, which results in many people missing out on benefits that they are eligible to receive. Although non-take-up results in short-term savings for the government, it may result in more costly downstream effects if it prevents people from affording early medical treatment or paying for a better education for their children (Van Mechelen & Janssens, 2017).
The United Kingdom introduced a welfare reform called Universal Credit (UC) in 2012, which consolidates six previously separate programs (Winchester, 2021). To be eligible for UC, most recipients who are unemployed (except those with infant children) have to seek work or take training courses, and noncompliance such as missing an appointment with a work coach can lead to sanctions (UK Government, 2014). Some studies also suggest that the reforms of UC have led to an increase in poverty for single mothers (Carey & Bell, 2020).
One type of supplementary social assistance offered in many high-income countries is in the form of refundable (or payable) tax credits, which provide cash benefits to eligible people with low incomes who file income tax returns. However, this form of income supplement has been criticized as being insufficient, especially for people with low incomes and without children (Koebel & Pohler, 2019

| Universal basic income (UBI)
UBI has been proposed as a way to alleviate poverty (Hasdell, 2020) and to replace the current assortments of social assistance programs in high-income countries, administered by different levels of government, which have been described as bureaucratic, costly, and stigmatizing (Koebel & Pohler, 2019; Reed & Lansley, 2016). UBI is "an income paid by a political community to all its members on an individual basis, without means test or work requirement" (Van Parijs, 2004, p. 8). More recently, additional dimensions of UBI have been specified: it is paid at regular intervals and as cash payments which recipients can spend in any way they choose (BIEN, 2020). The amount of the UBI payment should also be stable and predictable (Standing, 2021).
Proponents of UBI have criticized the reformed welfare programs of the past three decades as being fiscally unsustainable, overly intrusive, and inhibiting the agency of benefit recipients (Orrell, 2021). In terms of public opinion, a study in the United Kingdom and the United States found that the two main reasons cited in support of UBI were simplicity and efficiency of administration, and reducing stress and anxiety (Nettle et al., 2021).
Other important implications for UBI pertain to inequalities across socioeconomic status, race, ethnicity, and gender. Stressors such as financial difficulties, caring for disabled children or parents, and abusive relationships at work or home have damaging effects on mental and physical health, and these effects disproportionally impact women, racial/ethnic minorities, and people with low incomes (Thoits, 2010).
For women, UBI paid on an individual basis could potentially improve several areas of concern. Firstly, UBI would provide an income for women who perform work outside the formal labor
Although UBI could potentially reduce income inequality along racial lines, there has not been much recent policy discussion on this topic (Bidadanure, 2019).
UBI is, however, receiving renewed attention due to rising income inequality and the changing nature of work due to automation and reductions in the quantity and quality of jobs (Gentilini, 2020;Hasdell, 2020

| Measuring poverty
Regardless of the type of poverty reduction approach that is implemented, a major challenge is evaluating the effectiveness of the approach. This is because a standardized method does not exist for measuring poverty; indeed, there has been considerable debate over which poverty indicators are most accurate and reliable (Cutillo, 2020; Meyer & Sullivan, 2012 are still commonly used and have been criticized as being outdated and that they measure income inequality, not poverty (Gupta & Theoharis, 2020; Konle-Seidl, 2021). The Organisation for Economic Co-operation and Development, for example, defines the poverty line as "half the median household income of the total population" in each country (OECD, 2021b). Because of the arbitrary poverty threshold of such measures, millions of people slightly above the poverty line live precariously, "just a $400 emergency away from poverty" (Gupta & Theoharis, 2020 (Cutillo, 2020). Similarly, a comparison of poverty measures in the United States, including the official poverty measure (OPM), found that a consumption-based measure was more accurate in identifying people who were facing financial hardship; that is, low consumption was a better indicator than low income (Meyer & Sullivan, 2012). Consumption-based measures can also identify those with incomes above the official poverty line, but who spend a large amount on health-related expenses, which may cause difficulty in affording food and rent (Sarabia, 2016). nutrition, but also with social, developmental, and health impacts that may persist into adulthood (Ramsey, 2011; Thomas, 2019).
Food insecurity has been proposed as a more accurate and sensitive indicator of poverty than measures based on income and estimates of the cost of living (Loopstra & Tarasuk, 2013;Power, 2016). Loopstra and Tarasuk observed a linear relationship between the severity of food insecurity and the odds of experiencing hardships such as not being able to pay rent and bills on time.
To examine the relationships of various types of material deprivation, Toppenberg (Toppenberg, 2017)
In this review, we will examine basic income interventions for reducing poverty, assessed using traditional income-based poverty measures as well as alternative and novel measures (based on food insecurity, consumption, material deprivation, subjective financial stress, and other physical, social, and psychological dimensions of poverty that are reported in studies) to assess and compare the effectiveness of different variants of a guaranteed basic income.

| The intervention
A truly universal basic income policy has never been implemented in high-income countries (Gentilini, 2020; Gibson, 2020). Thus, our review will examine basic income interventions which include some features of UBI, as described below. These quasi-UBI approaches are known by various terms such as basic income guarantee (BIG), guaranteed annual income (GAI), unconditional cash transfer (UCT), and negative income tax (NIT). All of these variations share the common attribute of monetary benefits that would be guaranteed by the state (Van Parijs & Vanderborght, 2017), so we will use the term "guaranteed basic income" (GBI) in this review to cover all types of basic income interventions. We avoid the shorter term "basic income" because it is often used in the literature as a short form of "universal basic income". For the meaning of "basic", we will use the two interpretations outlined by Hoynes and Rothstein (2019): (1) an amount sufficient to pay for one's basic needs, or (2) an amount given to each recipient that provides a base which can be supplemented by other forms of income.
We will also define the 'regular' and 'predictable' payment criteria of GBI as being paid at least once per year and in the same amount each time. Although these are not always considered core criteria of a basic income, we consider predictable, regular payments of a fixed amount to be essential if GBI is used as an intervention to reduce poverty. Not knowing if the next payment will cover the same expenses as the previous one may cause anxiety and apprehension for the recipient, which could aggravate the experience of poverty.
Because some programs, described as a type of basic income, are based on dividends which can change in amount over time (e.g., from oil or casino revenues), we will include only those studies in which the amount received varies by less than 10% during the study period (i.e., the lowest amount received by each recipient must be at least 90% of the highest amount received).
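The 10% rule above can be expressed as a simple check on a recipient's payment history. The following is a minimal sketch, assuming payments are recorded as a list of positive amounts per recipient; the function name and the sample figures are hypothetical.

```python
def payments_within_tolerance(payments, tolerance=0.10):
    """Return True if the lowest payment is at least (1 - tolerance) of the highest.

    With the default tolerance of 0.10, this mirrors the rule that the lowest
    amount received must be at least 90% of the highest amount received.
    """
    if not payments:
        raise ValueError("at least one payment is required")
    if min(payments) <= 0:
        raise ValueError("payments must be positive amounts")
    lowest, highest = min(payments), max(payments)
    return lowest >= (1 - tolerance) * highest


# Hypothetical payment histories, for illustration only.
stable_pilot = [1_000, 1_000, 950, 1_000]      # varies by 5%
dividend_program = [1_600, 1_100, 2_000, 900]  # varies widely with revenues

print(payments_within_tolerance(stable_pilot))      # eligible under this rule
print(payments_within_tolerance(dividend_program))  # excluded under this rule
```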
One form of GBI is a negative income tax (NIT), whereby people whose income is below their tax liability threshold would receive an amount from the government based on a prescribed tax rate. For example, if a person's employment income was $20,000 per year and they would have to pay tax on income over $30,000, then the $10,000 difference would be subject to a "negative tax" such that the government would pay some amount of money to this person.
If the tax rate was 50%, this person would receive $5,000 per year as the NIT benefit, resulting in a total income of $25,000. If, on the other hand, the person had no income at all, the NIT benefit would be $15,000 (50% of the $30,000 threshold), so the person's total income would never fall below this amount and additional income would be subject to the NIT tax rate.
In contrast, welfare benefits are cut dollar for dollar if the recipient earns more income, so there is less incentive for recipients to seek low-paying jobs.
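The arithmetic of the NIT example can be sketched in a few lines and set beside a benefit that is withdrawn dollar for dollar. The $30,000 threshold and 50% rate come from the example above; the function names and the $15,000 welfare guarantee are assumptions chosen only to make the two schemes comparable.

```python
def nit_benefit(income, threshold=30_000, rate=0.5):
    """Negative income tax: pay `rate` times the shortfall below the threshold."""
    return max(0.0, rate * (threshold - income))


def welfare_benefit(income, guarantee=15_000):
    """Hypothetical benefit withdrawn dollar for dollar as income rises."""
    return max(0.0, guarantee - income)


for earned in (0, 10_000, 20_000):
    nit_total = earned + nit_benefit(earned)
    welfare_total = earned + welfare_benefit(earned)
    print(f"earned {earned:>6}: NIT total {nit_total:>8.0f}, "
          f"dollar-for-dollar total {welfare_total:>8.0f}")
```

Under these assumptions, a person earning $10,000 ends up with $20,000 under the NIT but only $15,000 under the dollar-for-dollar scheme, the same as with no earnings at all, which is the work disincentive described above.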
Some other forms of GBI also have a "take-back" condition in the intervention whereby the benefit is reduced at a known, prescribed rate when there is additional income from employment or other sources; however, the benefit must include a minimum guaranteed amount that is paid unconditionally (i.e., not affected by changes in income or employment status). This guaranteed amount will serve to differentiate studies of GBI included in this review from those of existing social assistance programs, including those with "soft" (minimal) eligibility criteria.
In summary, we will include interventions that meet the following criteria: (1) regular payment intervals, (2) paid in cash (not in-kind), (3) a guaranteed minimum amount received unconditionally, and (4) fixed (within 10%) or predictable amounts.
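For screening purposes, the four criteria could be recorded as yes/no judgments per intervention and combined as shown below. This is only a sketch of one possible coding scheme; the class, field names, and example interventions are hypothetical and are not part of the protocol's data extraction form.

```python
from dataclasses import dataclass


@dataclass
class Intervention:
    """Screening judgments for one intervention (hypothetical field names)."""
    name: str
    regular_intervals: bool       # (1) regular payment intervals
    paid_in_cash: bool            # (2) paid in cash, not in-kind
    unconditional_minimum: bool   # (3) guaranteed minimum received unconditionally
    fixed_or_predictable: bool    # (4) fixed (within 10%) or predictable amounts


def meets_gbi_criteria(intervention):
    """An intervention is included only if all four criteria are met."""
    return all([
        intervention.regular_intervals,
        intervention.paid_in_cash,
        intervention.unconditional_minimum,
        intervention.fixed_or_predictable,
    ])


# Hypothetical examples, for illustration only.
cash_pilot = Intervention("monthly cash pilot", True, True, True, True)
food_voucher = Intervention("food voucher scheme", True, False, True, True)

print(meets_gbi_criteria(cash_pilot), meets_gbi_criteria(food_voucher))
```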

| A note on means testing
In this review, we distinguish between means testing that is used to determine eligibility for social assistance programs, versus means testing that is used to recruit participants for a GBI program, pilot, or experiment. For social assistance, means testing is conducted on an ongoing basis, to monitor eligibility and to adjust the amount of the benefit if required (e.g., reducing the benefit amount if employment income increases). We will include studies of GBI interventions if participants are enrolled based on low income, unemployment, or other means-related factors, but not if the amount of the benefit is adjusted periodically based on those factors, with a dollar-for-dollar withdrawal rate in the benefit amount, as this would be similar to how conventional social assistance programs are administered.
Similarly, we will exclude studies of interventions that involve ongoing means testing to reassess eligibility based on changes in the participants' financial circumstances.

| How the intervention might work
Proponents of GBI suggest that it is a preferable way to relieve poverty than conventional welfare programs for several reasons:
1. GBI would avoid the stigmatization inherent in conditional, means-tested programs by offering the benefit to everyone within a community or at least everyone below a certain income threshold (Gentilini et al., 2020; Jenkins, 2019).
2. The means testing of applicants and scrutiny of recipients in welfare programs is labor-intensive to conduct; these procedures are not necessary with GBI. Thus, it would be a more efficient method of poverty reduction (Widerquist et al., 2013; Yang et al., 2021).
3. GBI is a matter of social justice which addresses growing income inequality and fosters a fairer sharing of the public wealth accumulated over successive generations (Gentilini et al., 2020; Standing, 2021).
One drawback of welfare programs is that not everyone who is eligible ends up receiving the benefit. Many people do not apply for assistance because of the stigma and shame associated with welfare, while others may not realize they are eligible because of the complex requirements and procedures for enrollment (Bidadanure, 2019; Gentilini et al., 2020). Alternatively, because government programs are often targeted toward specific populations (e.g., families with children), some people do not qualify for assistance (Koebel & Pohler, 2019). Because everyone in the community would be eligible for GBI, or those people under some income threshold, these problems would be avoided, as everyone with a low income would be able to receive the benefit.
ownership of durable goods such as houses and cars, or access to credit." As such, this review will examine studies of GBI interventions that use alternative measures, as described above, to assess their effectiveness for poverty reduction.
Food security was one outcome in a study of the Ontario Basic Income Pilot (OBIP) in Canada in 2018-2019, which provided a payment to recipients equal to 75% of the official poverty line, more generous than existing social assistance amounts. Over two thirds of the respondents in the study reported that their diet had improved, they skipped meals less often, ate more nutritious food, and accessed

| Why it is important to do this review
We found the following reviews that included GBI-like interventions in high-income countries, including one review of other reviews:
• Hasdell (2020) conducted a synthesis of reviews, published between 2011 and 2020, of interventions globally that included at least two features of UBI. Three reviews for low- and middle-income countries were included that reported on food insecurity or material deprivation.
The current review will be the first to quantitatively evaluate the effectiveness of various forms of GBI for reducing poverty in high-income countries, using food security level, consumption, material deprivation and multi-dimensional poverty indicators as primary outcomes. Although other reviews have included outcomes related to various dimensions of poverty, this review will synthesize findings related to all relevant material, social, and psychological outcomes according to current multi-dimensional conceptualizations of poverty.

| Policy relevance
Although guaranteed basic income as it is thought of today was first proposed by Thomas Paine in the 18th century, there has been a resurgence of support for GBI in recent decades by advocates in various fields: philosophy, economics, social policy, high-tech, and notably, from opposing points on the political spectrum (Alston, 2017). However, a major obstacle to constructive policy debates on GBI is that the theoretical conceptualizations of basic income (usually the universal variety) do not quite align with the ways in which GBI programs, pilots, and experiments have been implemented in practice (Gentilini, 2020; Yang, 2021). The disassociation between theoretical conceptualizations and the actual designs of empirical GBI interventions, as well as the heterogeneity of these designs, makes it difficult to agree on principles to guide the development of full-scale GBI programs (Gentilini, 2020; Yang, 2021). Because empirical GBI interventions only include some features of a true UBI and often enroll participants based on having income below some threshold, there is also ambiguity between the definitions of these interventions and those of liberal welfare programs. As well, the roles of various stakeholders (researchers, politicians, communities, news media) give rise to competing expectations which may result in misperceptions of the findings of GBI studies (Merrill, 2022). For these reasons, this review will attempt to develop a framework or rubric to facilitate the evaluation and comparison of various types of GBI interventions, so that empirical evidence can be more objectively assessed and synthesized and be more useful for policy discussions.
The inclusion of alternative and novel poverty measures in this review will also be relevant to public and social policy, particularly with respect to health and healthcare. The association between poverty and poor physical and mental health has been well documented (Boozary & Shojania, 2018; Gundersen & Ziliak, 2015; McLeod & Veall, 2006; Seligman & Schillinger, 2010
while productivity has continued to grow at the same pace, increasing by 62% between 1980 and 2020, wages have only increased by 17.5% in these four decades. Over the same time, the income gap between the rich and the poor has grown much wider: household income for the lowest quintile, adjusted for inflation, remained essentially unchanged between 1973 and 2015, whereas for the wealthiest 5% it increased by 60% (Stone et al., 2020). This suggests that most of the wealth generated by the increased productivity during recent decades has gone to the rich. The reasons for this include labor laws that favor corporations over unions, decreasing tax rates for the wealthy, and small increases in the minimum wage which have not kept pace with inflation (EPI, 2021). These factors, combined with workfare programs placing more people into low-paying jobs, have resulted in increasing numbers of the "working poor".

| OBJECTIVES
This systematic review will aim to appraise and synthesize the available quantitative evidence on GBI interventions in high-income countries, for the purpose of comparing the relative effectiveness of specific forms of GBI for alleviating poverty. As such, we will seek to answer the following research questions:
• What are the effects of various forms of a guaranteed basic income (GBI) on poverty and food security in high-income countries?
• Is there sufficient evidence available to determine a minimum amount of GBI to effect significant reductions in poverty?
• How do estimated effect sizes vary with the type of poverty measure used (income based, consumption based, multidimensional measures, and food security level)?
• What is the relationship between the various measures of poverty (i.e., which ones predict similar effects across different types of interventions)?

| METHODS
We will conduct and report this review according to the Methodological Expectations of Campbell Collaboration Intervention Reviews (MECCIR) guidelines (Methods Group, 2019a, 2019b). Due to the relevance of the review topic to societal equity, we will also follow the PRISMA-Equity reporting guideline (Welch et al., 2012). As well, we will consult the AMSTAR 2 critical appraisal instrument (Shea et al., 2017), intended to assist policymakers in assessing the quality of systematic reviews, to ensure that this review clearly addresses all the relevant criteria.

| Types of studies
The review will include primary studies that collect and analyze quantitative data on poverty-related effects of GBI interventions. We will exclude qualitative studies (e.g., case reports, narrative reports of interviews or focus groups) as well as any literature that refers to primary research reports or findings, such as reviews and compilations of studies, books, news and magazine articles, editorials, opinion pieces, or blogs.
We will include quantitative studies with any of the following designs:
• Randomized controlled trial (RCT)
• Cluster randomized controlled trial (cRCT)
• Controlled before and after (CBA)
• Regression discontinuity design (RDD)
• Interrupted time series (ITS) with at least three time points before, three time points after, and a time-series analysis
• Cohort (prospective or retrospective, including cross-sectional) with or without a control group, and with at least two repeated outcome measures
Cross-sectional studies using data from a single time point will be excluded as they do not examine change over time in a particular cohort.
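The design criteria above, including the minimum number of time points for ITS and cohort studies, could be applied during screening along the following lines. This is a sketch under the assumption that each study is coded with a design label and a few counts; the labels, parameter names, and function are hypothetical.

```python
# Designs that are eligible without further conditions (labels are hypothetical codes).
UNCONDITIONAL_DESIGNS = {"RCT", "cRCT", "CBA", "RDD"}


def design_is_eligible(design, points_before=0, points_after=0,
                       has_time_series_analysis=False, repeated_measures=0):
    """Apply the study-design criteria listed above to one coded study."""
    if design in UNCONDITIONAL_DESIGNS:
        return True
    if design == "ITS":
        # At least three time points before and after, plus a time-series analysis.
        return (points_before >= 3 and points_after >= 3
                and has_time_series_analysis)
    if design == "cohort":
        # At least two repeated outcome measures; a control group is optional.
        return repeated_measures >= 2
    # Anything else, e.g. a cross-sectional study at a single time point.
    return False


print(design_is_eligible("ITS", points_before=4, points_after=2,
                         has_time_series_analysis=True))  # too few points after
print(design_is_eligible("cohort", repeated_measures=2))
```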
We will include all longitudinal quasi-experimental designs even if they lack statistical controls; however, the more rigorous designs will likely be deemed to have higher internal validity, based on the risk-of-bias assessments.
RIZVI ET AL.

| Types of participants
We will include studies that involve any group of people in developed high-income countries (defined under "Types of settings" below).
Children will be included since some studies examine outcomes for the children of parents or guardians who receive GBI benefits.

| Types of interventions
We will include any cash transfer programs for adults (18+ years old) in high-income countries that meet our four criteria for GBI interventions: (1) regular payment intervals, (2) paid in cash (not in-kind), (3) a guaranteed minimum amount received unconditionally, and (4) fixed (within 10%) or predictable amounts.
Refundable/payable tax credits will be excluded because they are either small in amount (i.e., not enough to provide an income "base") or they are conditional (e.g., being employed, enrolled in a training program, having children of a certain age, caring for adults, or having a disability).
GBI benefits can be paid on an individual or household basis. The interventions can be administered by governments (usually as pilot projects) or by non-governmental or civil society organizations for research purposes. In studies that include control groups, usual care would be in the form of conventional government assistance programs for participants who are eligible to receive them or no government assistance for those who aren't.

| Types of outcome measures
Below are descriptions of the outcomes of interest for this review.
We will not exclude studies on the basis of outcome measures, as some studies may report other poverty-related outcomes that are important to include in this review.
The primary outcome of food security level is typically assessed
For the secondary outcomes, all measures below will be included.
Some outcomes such as weight and height measures, used to determine body mass index (BMI), will be measured using instruments or self-reporting, while other outcomes such as self-reported health status will be measured using validated scales (e.g., the SF-12 Survey for physical and mental health). Some secondary outcomes may be individual components of poverty indicators (e.g., food expenditure would be a component of a consumption measure).

Primary outcomes
• Food security level (using survey-based, validated measures, as described above)
• Poverty level assessed using instruments intended or designed to

| Search methods for identification of studies
This review will focus on studies that investigate GBI programs and initiatives implemented in developed high-income countries. The search strategy proposed for this review builds on those used in previous reviews on GBI (Gibson, 2020; Pinto, 2021). Searches using both keywords and database-specific controlled vocabulary will be conducted in relevant databases, and complementary searches will be done to identify additional studies as well as pertinent gray literature.

| Electronic searches
Searches will be conducted in subject-specific and multidisciplinary databases to identify relevant published studies to include in this review. Searches will be executed by PRL in the following databases. Database limits will not be used, and no restrictions related to languages, dates, or publication types will be imposed when searching the above resources.
An initial, sensitive search strategy was developed for MEDLINE (Ovid). Given the scope of this review, the research librarian and principal investigator determined that searching broadly for studies related to GBI would suffice and that no additional concepts would be included and combined. To assess its effectiveness, the strategy was peer-reviewed by another research librarian following the Peer Review of Electronic Search Strategies (PRESS) guideline for systematic reviews (McGowan, 2016). This search strategy will then be translated for the other databases using pertinent subject headings, where applicable, as well as appropriate search syntax.
The MEDLINE strategy is available in Supporting Information: Appendix 1.
In addition to using the above resources, searches will be done in the Cochrane Database of Systematic Reviews (Ovid), the Campbell Systematic Reviews journal (Wiley), the Social Systems Evidence database (McMaster University), and Epistemonikos (via the Cochrane Library) to identify relevant review articles.
Included studies will be added to Zotero, which integrates notifications from Retraction Watch, to determine if any of them have been retracted. Each included study will also be accessed on its original publisher platform to verify whether any corrections or updates were made since the original text was published.

| Searching other resources
Various approaches will be used to identify relevant gray literature.
In addition to searching for gray literature, other means of identifying studies will be used and are described below.
Reference lists from relevant knowledge syntheses (systematic and non-systematic reviews) as well as those from included primary studies will be examined to see if other studies should be considered.
Citation searching of included articles will also be conducted using Google Scholar (https://scholar.google.com/).
Once title and abstract screening is complete, journal titles of references eligible for full-text review will be analyzed to select the five journals that appear most frequently. These journals will then be hand searched by looking specifically at each one's table of contents for the past 5 years.
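The journal selection step described above could be sketched as a simple frequency count. This is only an illustration: the function name, the lower-casing used to merge spelling variants, and the sample titles are assumptions, not part of the protocol.

```python
from collections import Counter


def top_journals(journal_titles, k=5):
    """Return the k most frequent journal titles among eligible references.

    Titles are stripped and lower-cased so that trivial variants are
    counted together (an assumption made for this sketch).
    """
    normalized = [t.strip().lower() for t in journal_titles if t.strip()]
    return [title for title, _ in Counter(normalized).most_common(k)]


# Hypothetical journal titles exported from the screening tool.
titles = ["Journal A", "Journal B", "journal a ", "Journal C", "Journal B", "Journal A"]
print(top_journals(titles, k=2))  # ['journal a', 'journal b']
```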
The corresponding authors of included studies will be contacted by email and will be provided with a list of included articles along with the inclusion criteria for the review. They will be asked if they are familiar with any additional studies that might be relevant. In addition, authors of conference presentations will be contacted to see if their research has been published as articles or if they have data or results that they are willing to share or that will be published in the near future.

| Data collection and analysis
3.3.1 | Description of methods used in primary research
GBI interventions (programs, experiments, and pilot studies) have typically been carried out within selected geographic regions with participants whose income falls below a certain threshold amount.
Some studies employ a saturation approach where every eligible person in the community who enrolls receives the benefit, so that community-level effects can be examined. Although the types of outcomes are numerous, data are usually collected using surveys, and sometimes analyzed by incorporating administrative data.
Some basic income experiments are conducted as randomized controlled trials (RCTs) with intervention and control groups, while others are of a quasi-experimental (observational) nature, some using statistical controls such as propensity score matching to reduce bias.

| Selection of studies
All stages of screening references will be conducted with the use of Covidence, an online tool designed to streamline certain stages of review projects (https://www.covidence.org/). A summary of the inclusion and exclusion criteria (see Supporting Information: Appendix 2) will be posted on Covidence for reference. The selection of studies will begin with title and abstract screening, in which each reference will be screened independently by two reviewers. In case of disagreement, the decision on including the reference will be made by the principal investigator. The same process will be used at the full-text screening stage to determine the eligibility of the references that are retained after title and abstract screening. The reasons for excluding references at this stage will be recorded and presented in a PRISMA flow diagram.
Both screening phases will be subject to a pilot to ensure that the inclusion and exclusion criteria are applied consistently by all reviewers. Twenty-five randomly selected references will be used for the title and abstract pilot, and ten randomly selected references will be used for the full-text pilot. Reviewers will be able to provide feedback in the pilot forms at both stages regarding the clarity of the inclusion and exclusion criteria, and we will refine the wording based on the feedback if more than one reviewer expresses the same concern.

3.3.3 | Data extraction and management
Data will be extracted by two reviewers working independently, using an extraction form in Excel (Microsoft Corporation, 2022), based on the coding template in Supporting Information: Appendix 3.
The form will be piloted with ten articles on studies with diverse designs and outcomes, to check if more questions or categories are required in the form to capture all relevant information on the population, setting, study design, intervention, data collection and analysis, outcomes, and results. Because of the time-intensive nature of data extraction, we will attempt to resolve discrepancies by consensus between the two reviewers and by consulting a third reviewer if consensus is not reached.
If there are several articles included that report on the same study, two reviewers will perform the extractions using one form each, to consolidate the data reported in these articles. If there are discrepancies in the data reported in different articles on the same study, we will contact the study authors to ask for clarification. If we cannot reach the authors, we will use the data from the articles that present the most complete datasets and statistical analyses.
For multi-arm studies, we will only include the intervention and control groups that meet our inclusion criteria. However, we will note the presence of the other groups in the "Table of characteristics of included studies."
Based on our preliminary literature review, we do not expect to find crossover designs for GBI interventions, but if we do, we will use data only from the first stage of the study (i.e., before participants are moved into a different study arm) to avoid carry-over effects.

| Assessment of risk of bias in included studies
We will use the risk of bias tool described by Sharma Waddington and Cairncross (2021), which builds on previously developed tools (Higgins et al., 2016; Hombrados & Waddington, 2012; Jimenez et al., 2018; Sterne et al., 2016; Waddington et al., 2017) and combines scoring criteria for randomized and non-randomized designs so that the quality of studies using both designs can be compared. Risk of bias will be assessed independently by two reviewers, and ratings of "low risk," "some concerns," or "high risk" will be assigned for each of eight domains: Confounding, Selection bias, Attrition bias, Motivation bias, Performance bias, Measurement error, Analysis reporting, and Unit of analysis error (Sharma Waddington & Cairncross, 2021). We will attempt to resolve discrepancies by consensus, but if consensus is not reached, the higher risk of bias rating of the two reviewers will be used (i.e., the more cautious rating). To calculate an overall risk of bias score, we will convert the ratings to numerical values (0 = low risk, 1 = some concerns, 2 = high risk) and then sum the values to get an overall score between 0 (low risk in every domain) and 16 (high risk in every domain).
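The scoring rule above can be illustrated with a short sketch. The domain names and numerical values come from the text; the function names, data layout, and example ratings are hypothetical and serve only to show the arithmetic.

```python
# Illustrative sketch only; function names and example ratings are hypothetical.
DOMAINS = [
    "Confounding", "Selection bias", "Attrition bias", "Motivation bias",
    "Performance bias", "Measurement error", "Analysis reporting",
    "Unit of analysis error",
]
SCORES = {"low risk": 0, "some concerns": 1, "high risk": 2}


def reconcile(rating_a, rating_b):
    """Fallback when consensus fails: keep the higher-risk (more cautious) rating."""
    return max(rating_a, rating_b, key=SCORES.__getitem__)


def overall_score(ratings):
    """Sum the numerical values over all eight domains (range 0 to 16)."""
    missing = [d for d in DOMAINS if d not in ratings]
    if missing:
        raise ValueError(f"missing domains: {missing}")
    return sum(SCORES[ratings[d]] for d in DOMAINS)


# Two reviewers who disagree on two domains and do not reach consensus.
reviewer_a = {d: "low risk" for d in DOMAINS}
reviewer_b = {**reviewer_a, "Confounding": "high risk", "Attrition bias": "some concerns"}
final = {d: reconcile(reviewer_a[d], reviewer_b[d]) for d in DOMAINS}
print(overall_score(final))  # 3
```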

| Measures of treatment effect
For continuous outcomes, we will calculate standardized mean differences (d) to estimate effect sizes with 95% confidence intervals using weighted mean differences in natural (raw) [...] between the two regression lines at that point (Ramsay et al., 2003).
Because these effect size estimates are not based on standardized mean differences, the results of ITS studies will be analyzed separately from other study types.
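As a rough illustration of the standardized mean difference (d) mentioned above for continuous outcomes, the sketch below computes Cohen's d with a large-sample 95% confidence interval. The pooled-SD standardizer, the standard error approximation, the absence of a small-sample correction, and all names and input values are assumptions made for this example; the protocol does not specify them.

```python
import math


def smd_with_ci(mean_t, sd_t, n_t, mean_c, sd_c, n_c, z=1.96):
    """Cohen's d (treatment minus control) with an approximate 95% CI.

    Assumes a pooled standard deviation and the usual large-sample
    variance approximation for d.
    """
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    )
    d = (mean_t - mean_c) / pooled_sd
    se = math.sqrt((n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c)))
    return d, d - z * se, d + z * se


# Hypothetical summary data for one outcome in one study.
d, lower, upper = smd_with_ci(52, 10, 100, 50, 10, 100)
print(f"d = {d:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```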

| Unit of analysis issues
To minimize unit-of-analysis error, we will analyze results separately for different units of analysis: individual, household, and community.
GBI studies typically involve individual- and household-level allocation, with pre- and post-intervention measurements.
Cluster designs pose a challenge to comparing effect sizes across studies because the calculation of standardized mean differences is more difficult, due to the within-cluster and between-cluster variability (Hedges, 2007). Incorporating the total variance (within- and between-cluster) and adjusting for baseline measures of outcome variables and covariates can yield a more accurate estimate of effect size (Taylor et al., 2022). We will use a Shiny app provided by Taylor and colleagues (https://airshinyapps.shinyapps.io/es_2lvl_clust_adj/) to perform the calculation of effect sizes for cluster-design studies.

| Criteria for determination of independent findings
We expect to find multiple reports of each GBI study; these will be examined as a single study, and we will use all information available.
We also expect that some of the multiple reports will be more complete in terms of describing the methodology, and some may report on the primary outcomes we will examine in this review, while others may report on secondary outcomes. If there are differences in the details between reports, we will contact the authors to verify which is correct (e.g., different numbers of participants).
For multiple outcomes in the same study which are conceptually similar, we will choose (in order of preference): (a) the outcome that was classified as a primary outcome for the study, (b) the outcome that was reported first in the abstract, or (c) the outcome that best matches our definition of the construct.

| Dealing with missing data
Studies will not be excluded on the basis of how data are reported. If included articles do not report statistical data necessary for meta-analyses and the data cannot be calculated reliably (e.g., using reported confidence intervals to calculate SDs for continuous outcomes, or using sample sizes and percentages to derive 2 × 2 tables for dichotomous outcomes), we will contact the study authors to ask for the missing data. If we cannot acquire the necessary statistic, the result will not be used in the meta-analysis, but will be included in the narrative synthesis.
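The two derivations given as examples above can be sketched as follows. This assumes that the reported interval is a 95% confidence interval around a single group mean from a reasonably large sample (so a normal quantile of 1.96 applies) and that percentages refer to participants experiencing the event; the function names and numbers are illustrative only.

```python
import math


def sd_from_ci(lower, upper, n, z=1.96):
    """Recover a group's SD from the confidence interval of its mean.

    The CI half-width equals z * SE, and SE = SD / sqrt(n).
    """
    se = (upper - lower) / (2 * z)
    return se * math.sqrt(n)


def two_by_two(n_t, pct_t, n_c, pct_c):
    """Build a 2 x 2 table [[events, non-events], ...] from group sizes and percentages."""
    events_t = round(n_t * pct_t / 100)
    events_c = round(n_c * pct_c / 100)
    return [[events_t, n_t - events_t], [events_c, n_c - events_c]]


# Hypothetical reported values.
print(round(sd_from_ci(48.04, 51.96, 100), 2))  # 10.0
print(two_by_two(200, 30.0, 150, 40.0))         # [[60, 140], [60, 90]]
```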

| Assessment of heterogeneity
We will use the I² statistic, calculated using RevMan as part of the meta-analyses, to examine heterogeneity. The I² statistic is a percentage estimate of the variation across studies that is due to heterogeneity and not randomness. An I² value of 0 (zero) would mean there is no inconsistency across the studies, whereas 1 (or 100%) would mean extreme heterogeneity. Studies that markedly increase the I² value will be excluded from meta-analyses and will be analyzed separately (e.g., if we find that including a fifth study with four others increases I² from 0.30 to 0.75).
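For readers unfamiliar with the statistic, the sketch below shows the standard way I² is derived from Cochran's Q under inverse-variance weighting. It is not RevMan's code; the function name and the example effect sizes and variances are hypothetical.

```python
def i_squared(effects, variances):
    """I-squared (as a percentage) from study effect sizes and their variances."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Three hypothetical studies with equal precision.
print(round(i_squared([0.1, 0.3, 0.5], [0.01, 0.01, 0.01]), 1))  # 75.0
```

The function returns a percentage, so a value of 75.0 corresponds to 0.75 on the proportion scale used in the example above.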

| Assessment of reporting biases
Because GBI interventions are typically conducted by governments (municipal, state/provincial, federal), we expect that reports will be published both in journals and non-academic sources, and that this will not be associated with publication bias.
To determine if outcomes are selectively reported or omitted, we will compare different articles on each study, and we will also search for proposals, pre-analysis plans, and protocols, to see if they specify unreported outcomes.
We will also check for selective outcome and analysis reporting using the risk of bias tool described above.

Quantitative synthesis
Extracted data will be entered into RevMan by one reviewer, and a second reviewer will independently verify the accuracy of the entered data.
Experimental (RCT) and quasi-experimental studies will be analyzed separately.
If there are sufficient and appropriate data to conduct meta-analyses (i.e., two or more studies with the same design reporting on the same outcome), we will calculate the pooled effect size using RevMan. The I² statistic will be used to assess heterogeneity between studies. If the I² value is small (≤25%), indicating low heterogeneity (Higgins, 2003), we will use a fixed effects model for the meta-analysis. If I² is larger than 25%, indicating moderate or high heterogeneity, we will use a random effects model. For random effects meta-analyses, we will estimate the heterogeneity among the effect size parameters using the between-study variance, τ², which will also be calculated using RevMan.
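The model-selection rule above can be sketched as follows. The 25% threshold comes from the text; the inverse-variance weighting and the DerSimonian-Laird estimator for τ² are assumptions chosen for illustration, and the function is not a reproduction of RevMan's implementation.

```python
import math


def pool(effects, variances, threshold=25.0):
    """Pool effect sizes, choosing a fixed or random effects model by I-squared."""
    w = [1.0 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # DerSimonian-Laird between-study variance (assumed estimator).
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    if i2 <= threshold:
        model, weights = "fixed", w
    else:
        model, weights = "random", [1.0 / (v + tau2) for v in variances]
    estimate = sum(wi * y for wi, y in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return {"model": model, "estimate": estimate, "se": se, "i2": i2, "tau2": tau2}


# Hypothetical effect sizes and variances from three studies.
result = pool([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
print(result["model"], round(result["estimate"], 2), round(result["tau2"], 3))
```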
For multi-arm studies with a single control group, we will divide the number of participants in the control group by the number of eligible intervention groups, to prevent double counting of participants if more than one intervention group is included in the meta-analysis.
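A minimal sketch of the control-group adjustment described above is given below. The function name and arm labels are hypothetical, and the sketch leaves the shared sample size unrounded, which is an assumption; the protocol does not state how fractional values will be handled.

```python
def split_shared_control(n_control, intervention_arms):
    """Divide a shared control group's sample size evenly across eligible arms."""
    k = len(intervention_arms)
    if k == 0:
        raise ValueError("at least one eligible intervention arm is required")
    share = n_control / k
    return {arm: share for arm in intervention_arms}


# A hypothetical study with two eligible intervention arms and 300 controls.
print(split_shared_control(300, ["arm_1", "arm_2"]))  # {'arm_1': 150.0, 'arm_2': 150.0}
```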

Narrative synthesis
Due to the variation across GBI interventions, study designs, populations, and outcome measures, we expect that meta-analyses will not be possible for many of the studies. In this case, we will present the findings of these studies in narrative form, including calculated effect sizes for each study. We will construct tables to classify the studies according to the type of GBI, study design, and outcomes. We will also illustrate effect sizes in graphical form for studies that can be grouped and compared in a meaningful way.
To examine whether GBI interventions impact health inequities within the sample populations, we will assess the effects of GBI on physical and mental health across the sociodemographic categories of the PROGRESS-Plus framework. The PROGRESS acronym stands for place of residence, race/ethnicity, occupation, gender, religion, education, social capital, and socioeconomic status, while "Plus" refers to any other factors which may be associated with disadvantage, such as age, criminal record, disability, or sexual orientation (Kavanagh et al., 2009; O'Neill et al., 2014). Depending on the context of the study, a "Plus" factor may be the most relevant (O'Neill et al., 2014).

| Subgroup analysis and investigation of heterogeneity
If we conduct other post hoc subgroup analyses not specified above or in Supporting Information: Appendix 4 (Table of subgroup and moderator variable codes), we will report in the review that the additional analyses are post hoc and exploratory in nature.
If there are enough included studies to meaningfully compare the difference in effect across subgroups, we will conduct a meta-regression to test the mean difference between the groups.
As described above, we will use the I² statistic calculated using RevMan to examine heterogeneity. We will investigate the reasons for heterogeneity by exploring and comparing the intervention types and contexts, populations, and outcome measures.

| Sensitivity analysis
If there are sufficient similar studies to conduct meta-analyses, we will verify the robustness of the meta-analyses by comparing the quality of the studies (as determined by our risk of bias assessments) to ensure that the effect sizes are not excessively influenced by one or more low-quality studies.

| Treatment of qualitative research
We do not plan to include qualitative research.

3.3.15 | Summary of findings and assessment of the certainty of the evidence
We will present a GRADE "summary of findings" table and will assess the certainty of the evidence, following the method of Schünemann and colleagues (Schünemann et al., 2019). Separate tables will be presented for each type of GBI intervention (e.g., subsistence-level benefits for households, monthly amount below €500 for individuals), and the tables will include all the primary and secondary outcomes (listed above) for which results are reported in the included studies.
Two reviewers will independently apply the GRADE approach to assign for each outcome an overall level of the quality of evidence, that is, our level of certainty that the estimate of the effect is close to the true effect. The quality of the evidence will be ranked as "high," "moderate," "low," or "very low." If there is a high degree of heterogeneity among studies so that we cannot pool the results, we will apply the GRADE approach to assess the certainty of evidence and will present a narrative summary of the effect.

DECLARATIONS OF INTEREST
VW is editor-in-chief and acting CEO of the Campbell Collaboration.
VW will not be involved in the editorial decision process for this review.
MG led a scoping review of public health effects of interventions similar to basic income, published in 2020 in Lancet Public Health.
The authors of this review declare no conflicts of interest.

PRELIMINARY TIMEFRAME
The approximate date for submission of the systematic review is December 2022.

PLANS FOR UPDATING THIS REVIEW
We plan to update this review 4 years after publication of the review.
If this is not possible for some reason, the lead author will communicate this to the Social Welfare Coordinating Group.

Internal sources
• No sources of support provided

External sources
• No sources of support provided